SINGULAR CONTINUATION: GENERATING PIECE WISE LINEAR 
APPROXIMATIONS TO PARETO SETS VIA GLOBAL ANALYSIS 
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Abstract. We propose a strategy for approximating Pareto optimal sets based on the global 
analysis framework proposed by Smale (Dynamical systems, New York, 1973, pp. 531—544). The 
method highlights and exploits the underlying manifold structure of the Pareto sets, approximating 
Pareto optima by means of simplicial complexes. The method distinguishes the hierarchy between 
singular set, Pareto critical set and stable Pareto critical set, and can handle the problem of super- 
position of local Pareto fronts, occurring in the general nonconvex case. Furthermore, a quadratic 
convergence result in a suitable set-wise sense is proven and tested in a number of numerical exam- 
ples. 
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1. Introduction. 

1.1. Multiobjective optimization and Pareto optimality. Multiobjective 
optimization is concerned with the problem of optimizing several functions (or ob- 
jectives) simultaneously. A precise mathematical statement in an economical frame- 



work was first given by V. Pareto 36 37] in the 1880 's. In recent years a strong 
interest has grown, as a variety of problems in structural mechanics, automotive, 
aerospace, production planning, environmental policy and many others, involve more 
than one objective function and different numerical strategies have been developed 



subsequently 32 



In the single objective case, an optimum is defined as a point x £ W C K n 
where a given function u : W — > K assumes its maximum, if the maximum exists. In 
multiobjective optimization we consider two or more functions, m, ■ ■ ■ , u m : W — > K, 
and in all the non trivial cases the optima for one function are distinct from the 
optima of the remaining ones. A key point is that not only one has to consider the 
optima of the individual functions, which usually are finite, but also one usually finds 
an infinite number of so-called non-dominated points. They are defined precisely as 
follows. 

Definition 1 (Pareto optimality). Let W be an open subset ofW 1 , or an Tri- 
dimensional manifold, and let u\ , . . . , u m : W — > K be smooth function^ A point 
x G W is called a non-dominated point, or a Pareto optimum, if there is no x G W 
such that Ui(x) ^ Uj(x) for all i = 1, . . . ,m and Uj{x) > Uj(x) for some j. If there 
exists a neighborhood V C W of X where X is Pareto optimum, then x is called a local 
Pareto optimum. 

1.2. The necessity for global representations of the Pareto sets. As 

pointed out for instance in 19] , the set of Pareto optima is in many cases a large and 
complicated nonconvex set and most of the existing algorithms, being inspired by local 
search ideas from traditional linear and nonlinear programming, fail at giving a truly 



global representation of the set of Pareto optima. Also 11 stresses that "a whole 
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1 We may equivalently refer to a unique smooth vector function u : W —> R m , or mapping. 
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collection of Pareto optimal points, representative of the entire spectrum of efficient 
solutions" would be helpful in facilitating design in engineering applications. 

Recent multiobjective optimization literature tackled this issue focussing on defin- 



ing algorithms producing even distributions of Pareto points 11 29 30.62 while an 



alternative philosophy 41-43 49 dealt with producing local meshes approximating 



Pareto sets, relying on continuation (homotopy) strategies. In the recent paper 38 



both topics are addressed. Alternative techniques aiming to approximate the entire 



optimal set are described in the recent papers [18 27 , in the survey 44 and in the 
references therein. 

We want here to highlight a key feature of the Pareto set which makes it, in 
general nonconvex cases, a complicated set. Its complexity is even amplified when the 
Pareto set is viewed in the output space. 
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Fig. 1.1. Possible problems arising when the objectives are nonconvex functions, (a) The 
Pareto set is composed by separate branches, (b) A connected global Pareto set is composed by 
separate local branches crossing each other. 



Indeed, the set of all global Pareto optima can be disconnected, i.e., composed 
by separate portions of seemingly smooth surfaces (see figure [l~l| a)). Furthermore, 
even when the image of the set of global Pareto optima is a connected set, in fact it 
could be composed by cutting and sewing together different locally optimal branches 
(see figure |l"T| (b)), coming from separate zones of the domain. We will illustrate 
in the sequel that this kind of behavior is not an artifact obtained with unrealistic 
functions but in a sense represents a typical situation that is not destroyed by slight 
deformations of the functions. Those situations are persistent, or more technically, 
structurally stable. 

We notice that the algorithms mentioned above are expected to work properly 
only in a local sense, although they are intended to capture some of the global features 
of the optimal set. Moreover, apart from the homotopy techniques, they are point-wise 
strategies, in the sense that as Pareto optimal set they produce a scatter of points; the 
evenness of the distribution of points is then estimated on the image space. In some 
applications those points are joined together in a compound structure, e.g., a Delaunay 
triangulation, but only a posteriori and in the output space. It should be noticed that 
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because of the effects of mappings described above, defining such a structure from the 
output space, i.e., joining nearby values of optimal points, is subject to failure: the 
corresponding preimage points, indeed, are not necessarily nearby, or even connected 
in the way suggested by the positions of their images. Conversely, the images of 
nearby optimal points are nearby, because of continuity. 

We propose instead that a faithful global representation of the Pareto set in gen- 
eral nonconvex settings is obtained according to the following three steps. Firstly, 
by shifting the focus from the output space to the input space, secondly by approx- 
imating the full set of the local Pareto optima, and thirdly by adopting a set-wise 
standpoint, namely using compound geometrical objects as simplicial complexes in- 
stead of scatters of points. The first step unfolds the singularities (branches crossing, 
cusp points and so on) occurring as an effect of the mapping. Indeed, as illustrated 
in the sequel, the preimage of the Pareto set is non-singular, as it exhibits in general 
a regular manifold structure. The second step, because of possible superpositions 
of local branches, guarantees that every portion of the global Pareto optimal set is 
represented. Thirdly, simplicial complexes, i.e., meshes, faithfully reflect the mani- 
fold structures and explicitly offer the desired parametrization for each portion of the 
Pareto set, allowing to perform "trade-off studies" among the conflicting objectives. 
Indeed, trade-off studies may be the application of greatest practical importance of 
multiobjective optimization. Nevertheless, from the above discussion it is clear that 
trying to track the surface of the Pareto set by picking points from the output space, 
as point -wise strategies are aimed to do, is supposed to work correctly only for limited 
intervals. 

There are at least two reasons why our program has not yet been pursued in its 
entirety. First of all, in a number of situations, the numerical techniques available in 
literature are able to build sufficiently faithful representations of the Pareto set. E.g., 
when the functions at hand are convex, or relatively simple, or when the singularities 
are situated far away from interesting zones, a global investigation of the problem is 
not required. Moreover, typically, trade studies are performed in the neighborhood of 
a previously determined solution, therefore they can be limited to a non problematic 
branch of the Pareto set giving back as well the important information. Secondly, it 
is clear that a global exploration of the domain is a demanding task which could be 
far out of the scopes of a typical design problem. 

Nevertheless, faithful global representations of the Pareto set are a worthy goal to 
pursue, because they complement existing local exploitation strategies in two senses: 
they resolve the above mentioned problematic superpositions and they facilitate the 
location of important zones, which could merit further investigation. It is clear that 
this kind of program has to be implemented in an efficient way in order to be real- 
istically useful in applications. On the other hand, even a roughly sketched global 
picture of the whole situation can give crucial information on the problem at hand, 
suggesting correctly the location of paramount zones. 



1.3. Global analysis and multiobjective nonlinear programming. With 
this in mind, we have devised a novel numerical strategy for approximating Pareto 
sets, theoretically based on the global analysis ^framework established by S. Smale and 



others in the early 1970's 12 13115 16,55 56 63-65 and in more recent work 33 34 



2 See |54| . For brevity, we speak a bit loosely of global analysis also when referring to concepts 
of singularity theory or differential topology. 
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Motivated by his discussions with G. Debreij^] Smale investigated the problem of 
optimizing several functions within the dynamical systems arena. In the series of 
works that followed there emerged interesting topological and geometrical features of 
the sets of the Pareto optima. The notion of Pareto critical set 9, generalizing the 
concept of critical point for scalar functions, was introduced and furthermore, local 
Pareto optima were characterized by means of first and second derivatives. Quoting 
Smale [55] : 

"We study the local and global nature of 6, as one uses freshman 
calculus to study the maximum of a single function. " 
One of the basic facts highlighted in Smale's global analysis framework, is that 
under the assumptions of second order differentiability and some generic transversality 
condition, Pareto optimal sets are portions of (m — 1) -dimensional manifolds. It is 
fundamental that a slight deformation of the functions do not alter substantially 
the Pareto set. Global analysis is the proper setting where to study such kind of 
resilience properties. A mapping u : W — > R m is said structurally stablt^ if there 
exists a neighborhood N(u) in the C r topology such that every function u in N(u) 
is equivalent to it, i.e., there exist diffeomorphisms h,k, close to the identities of the 
respective spaces, such that the diagram: 



W — ^ R m 

h k 

W — ^ E m 



commutes. Clearly if two mappings are equivalent, their Pareto sets are diffcomorphic. 
One of the main results of global analysis is that there exists an open and dense set in 
C r (W, M m ) of structurally stable mappings^] In other words, the Pareto set of almost 
every mapping u is as close as desired to the Pareto set of any other mapping in a 
sufficiently small neighborhood of u. This is clearly of fundamental importance for 
the applications: when functions are known only with a certain approximation, as 
usual in engineering design problems, the set of optimal points is guaranteed to be 
approximated correctly by any convergent sequence of functions 7 , 54 . 60 1 p] Moreover, 
a generalization of Morse theory for several functions can be defined |55||63| . 

The strategy presently proposed highlights and exploits the manifold structure 
underlying the Pareto sets and precisely reproduces the hierarchy, described in Smale's 
work, among singular set, Pareto critical set and stable Pareto critical set. These 
sets are approximated by means of simplicial complexes, and exploiting Newton-type 
estimates it is possible to prove quadratic precision in set-wise sense, adopting the 
Hausdorff measure. Because of this result the present method can be considered as a 



set-wise variant of multiobjcctive Newton methods, as 17 



3 Dcbreu won the Nobel prize for Economics in 1983 "for having incorporated new analytical 
methods into economic theory and for his rigorous reformulation of the theory of general equilib- 



15 16 



bil 



rium". For an account of the cooperation between Smale and Dcbreu, sec 

4 To be precise, we should speak of stability of mappings, while structural stability is more often 
used when speaking about differential equations. On the other hand we must speak about stability 
of Pareto optima, which is instead a concept deriving from the study of stability of equilibra, and 
refers to critical points which are maxima. Therefore we will keep speaking of structural stability 
when dealing with typical singularities of mappings. 

5 It is necessary that m < 7 and n ^ 8, or m < 6 and n = 8 [28] . 

6 The original idea of structural stabil ity is a joint work from an engineer, A. Andronov, and a 
mathematician, L. Pontryagin, see [53p4] . 
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The algorithms of this paper can also be considered as a globalization and gen- 
eralization to more than two objectives of the homotopy techniques, while the use 
of tessellations can be thought of as the specialization of the techniques of simplicial 
pivoting (3H5] to the problem of optimizing several functions. A strong similarity 



can be found in the method proposed by 49 , where the authors detect and pro- 
gressively refine the hypercubes containing the Pareto sets, relying on the standard 
Karush-Kuhn-Tucker conditions, instead of Pareto criticality. 

2. The global analysis framework. We recall now Smale's definitions and 
results. Let W C R d be an open set or more generally a smooth n-dimensional 
manifold, u : W — > K m a smooth vector function, with m ^ n. Q The singular set 
X C W , is the collection of singular points, i.e., the points where the rank of the 
Jacobian Du(x) is non maximal. If m = 1, the singular set coincides with the set of 
critical, or stationary, points, i.e, Du{x) = 0. It can be proved that under generic 
conditions the singular set is a smooth manifold. 

Let Pos be the open positive cone in R m , Pos := |?/ 6 R m yj > 0,Vj = 1, ... ,m\, 

and let C x the corresponding open cone in the tangent space T X W, C x := Du^ 1 (Pos). 
Definition 2 (Pareto critical set 9). The set 



C x = %\, (2.1) 



is called the Pareto critical set. 

We characterize 9 in terms of the Jacobian of u. 

Proposition 3 (First order proposition). Let x e W . Then, x € 9 if and only 

if: 

(a) {Duj(x)} =1 do not belong to a unique open half space of the cotangent space 
T*W. 

(b) 3 Xj ^ 0, j = 1, . . . ,m, not all zero such that ■ XjDuj(x) = 0. 

Remark 4- The meaning of Proposition^ is to exclude at the first order, for x to 
be critical, the existence of paths along which all the objectives Uj can be incremented 
at the same time. If there were an open half space containing all Duj , as in condition 
(a), any direction in this half space would be a direction of improvement for every Uj. 
Equivalently, condition (b) states that the gradients Duj should be linear dependent 
and furthermore should "oppose " each other. In other words, moving in the direction 
of maximal increasing according to one of the Uj causes one or more of the remaining 
Ui to strictly decrease. 

Remark 5. In the bi-objective case, m = 2, Proposition^ states that in Pareto 
critical points the two gradients are collinear and in opposition to each other. Also 
critical points for one of the two objectives are Pareto critical. 

In analogy with freshman calculus, (Pareto) criticality is a necessary condition for 
x to be optimal. In order to discriminate the nature of the Pareto critical points we 
introduce a notion of stability and point out its relation with the second derivatives 
of u. This will give sufficient conditions for x to be Pareto optimal. 

Definition 6. A curve (a, 6) 3 t h-> ip{t) € W is said to be admissible if 

j t Ui {<p{t))> 0, t£(a,b), Vi = l,...,m. (2.2) 



7 The case m < n is less frequent. We will consider some aspect of this case in the sequel. 
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Clearly, if a point is Pareto critical, there could not exist admissible curves passing 
through it. In order to establish its optimality it is necessary to investigate the behav- 
ior of the admissible curves in a neighborhood of a critical point. Admissible curves 
are smooth paths along which every objective is incremented. Therefore, they move 
towards local Pareto optima; conversely, if a critical point captures all neighboring 
admissible curves, that point is a local Pareto optimum. 

Definition 7. A Pareto critical point x is said to be stable, x € 9s, if, given a 
neighborhood V x of x in W , there exists a neighborhood U x of x inV x , such that every 
admissible curve tp : [a,b) — > W, with <p{a) G U x satisfies Image((p) C V x . Pareto 
stability can be fully decidable by carefully examining the second derivatives of the 
objectives. In the single objective case, by virtue of the Morse's Lemma, it is possible 
to find a coordinate system where the objective u can be written as a quadratic 
polynomial u = ±x\ ± • • • ± x 2 d , which number of minus signs defines the Morse 
index, and therefore decides the nature of the critical point (maximum, minimum or 
saddle) [35]. With some effort, results can be extended to multiple objectives: second 
derivatives are not defined invariantly, but if we think about them as a symmetric 
bilinear form restricted to the kernel of the differential Du(x) assuming values on 
the cokernel M. m /lma,ge(Du(x)) , then this form is invariantly defined. It is called 



"2nd intrinsic derivative" (see [28] 40 ). The restriction to the kernel of the tangent 
map Du(x) has also the following meaning. By investigating the attractive/repulsive 
behavior of admissible curves in a neighborhood of a critical point, we will not be 
interested in what happens along the directions parallel to the critical set, while 
the orthogonal space will be the arena where the stability of the critical points will 
be decided. The case of greatest importance is where corank Du(x) is 1 (i.e., rank 
Du{x) is m - 1). In this case the second intrinsic derivative assumes values in a 1— 
dimensional vector space. If we consider x£0,we have lma,ge(Du(x)) Pos — 0, thus 
R m /Image(_Du(x)) has a canonical positive ray. We call the 2nd intrinsic derivative, 
in this case, the generalized Hessian H x . It makes sense to say that H x is negative 
definite or positive definite, as well as to define an index, as the index of the symmetric 
form H x . We set 

d6={xe0 Image(Z>u(aO) n {Cl(Pos) \ {0}} ^ 0| (2.3) 

where Cl(Pos) is the closure of Pos. 

Proposition 8 (2nd order Proposition). Let u : W — » E m a smooth map with 
x G 6, x 86 and corank Du(x) = 1. Then 

(a) if the generalized Hessian H x is negative definite, then x € 9s- 

(b) Let Xj ^ 0, j = 1, . . . ,m be as in the 1st order proposition; then (up to a positive 
scalar) 



H x =y.X j D^u j (x), on ker Du{x). (2.4) 

3=1 

The proposition is proved in [56] , while a discussion of the genericity of the hy- 



potheses on the rank assumption is given in 55 . 

Most importantly, Proposition [8] offers a useful and workable criterion for decid- 
ing the stability of critical points. We will translate numerically this proposition in 
Algorithm [2] 

2.1. The structure of Pareto sets. We start by recalling the notion of Thorn's 



stratification (see 58-61]) 
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Definition 9. Let A C W be a closed subset. A stratification S of A is a finite 
collection of connected submanifolds of W satisfying the following properties: 

(1) U SeS S = A. 

(2) If S £ S then OS = Cl(S) \ S is a union of elements of S of lower dimension. 

(3) If S £ S and U is a submanifold of W transversal to S at x £ S then U is 
transversal to all elements of S in a neighborhood of x. 

The following theorem has been proved in [12]. Consider the space C°°(W, M. m ) 
endowed with the C°° topology. W is a compact manifold with dimension n ^ m. 

Theorem 10 (9 is a stratified set of dimension m — 1). There is an open and 
dense set Q C C°° (W,R TO ) such that if u £ Q then 6 is a stratified set of dimension 
m — 1. 

Remark 11. If m > n, it is possible to prove that, for a generic mapping u, 9 is 
a stratified set of dimension n. 

From the point of view of the numerical applications, we state that in the generic 
case the strata of the Pareto critical set 9 can be discretized by means of a collection 
of (m — l)-dimensional meshes. Obviously we would like to refine this procedure 
to 9s- Unfortunately, the following conjecture has been proved only for m — 2,3 



(see 13 63]). 

Conjecture 12. There is an open and dense set Q C C°°(W,M. m ) such that if 
u £ Q then 9 is a stratified set and 9s is a union of strata. 

Remark 13. The stable Pareto critical set 9s is formed by all the local Pareto 
optimal points. The global Pareto optimal points cannot be distinguished from local 
optima by means of differential features as in the statements presented above. Global 
Pareto optima can only be filtered out a posteriori. 

3. Numerical translation of the global analysis approach. In the following 
sections we illustrate numerical methods for approximating Pareto sets on the basis 
of Propositions [3] and [8] The procedure is reminiscent of contour plot algorithms 
for plotting level sets of functions, and is a special instance of general strategies for 
piecewise-linear approximation algorithms for implicitly defined manifolds 1 1 , 3[[5][6] . 
The method determines a simplicial complex approximating the singular set E and 
then refines it to the critical set 9 and to the stable critical set 9 S . Because the strategy 
proposed consists in a continuation method focussed on the manifold structure of 
Pareto optima inherited by the singular set, we coined the term singular continuation. 

3.1. First order search algorithm. Algorithm[T]translates numerically Propo- 
sition [3] We start by considering a set of data points V — {Pi, . . . , Pjy} where we 
will evaluate the Jacobian J Ul then we build a Delaunay tessellation having T> as 
nodes0 We assume that the nodes P\, . . . ,P n are in general position, i.e., they give 
rise to a valid Delaunay tessellation. Better results are obtained if the simplexes are 
"round", i.e., they do not possess very thin or very large angles. Special tessella- 
tions, e.g., Freuenthal-Kuhn, simplify the operation of "pivoting" from a simplex to 
the adjacent, speeding up the process of glueing together the polytopes composing 
the implicitly defined manifold Hereafter, we also assume that the dataset is 
sufficiently dense to resolve all the feature of the singular manifold E. More pre- 
cisely, we assume that every connected component of E intersects at least one of the 



8 In the implementation considered in what follows we employed the qhull software 81, based on 
the computation of convex hulls, and, in the two dimensional examples, we employed the TRIANGLE 
software 51,52. For iterative schemes, a efficient alternative is offered by the Bowyer- Watson 
algorithm 10,66 , which is incremental. 
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(n — rn)-faces A of the tessellation, and the intersection is unique and transversal, 
i.e, diml^E © T X A = n = max. Doing so E is guaranteed to be homeomorphic to 
its piece-wise linear approximation. We denote by E, 9 and 9 s the portions of the 
singular set, critical set and stable critical set, respectively, which possibly are con- 
tained in a simplex A of the tessellation of the domain considered. Hatted symbols, 
E, 9 and 9s, denote the corresponding piecewise linear approximations. The details 
of the algorithm are discussed in subsection |3.2| 



Algorithm 1 First order algorithm for approximating the Pareto critical set 9 
1: Consider a set of data points V = {Pi, ... , Pat}; 
2: evaluate the gradients of the Uj on the data points; 
3: build a Delaunay tessellation on the nodes V; 

4: for all Delaunay simplex A = (Pi , . . . , Pi n ) in the tessellation do 

5: compute the (m — l)-polytope E where the 1st order approximation of the 

Jacobian of u vanishes; 
6: extract the sub-polytope 9 where the vanishing linear combination \iDu\ + 

• • • + \nDu m = has non negative coefficients; 
7: end for 

8: compose a simplicial complex glueing together adjacent polytopes 9. 



Remark 14- The algorithm assumes n ^ m. When m > n things extend quite 
easily, because the singular set is all of the input domain, and as recalled in Section 
\2.1\ the critical set is a stratified set. More precisely, the gradients are always linearly 
dependent, thus it is sufficient to skip step 5 of Algorithm^ 



3.2. Analysis of simplexes. We cycle through the tessellation simplexes A = 
(Pi t , . . . , Pi„ +1 ) and approximate the portion of the Pareto critical set 9 possibly 
contained in A. To determine the linear approximation 9 S of the stable Pareto critical 
portion 9 S n A we recall that 9 is contained in the singular set E, i.e., the set where 
the rank of the differential Du(x) is less than maximal: 



Os C 9 C E C W, (=> 9 S C 9 C E C A.) (3.1) 



Adjacent approximate portions 9 S are eventually sewed together. 



3.2.1. Singular set E. We fix a cell A := (Pi, . . . , P n +i)- The Jacobian is an 
n x to matrix which rank is non maximal on the singular set E. The rank of J u drops 
when the rows are linearly dependent, e.g., when a suitable subset of the m-order 
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minors are degenerate. We consider for instance the following submatricesj^] 



M = 





dui 




dx m 


dUm 






dx m 



Ma = 



' ! }X2 



dx 2 



Ml \ 



du m 
dx m+1 ) 



I dux 

dx n _ m+1 



dui 



du m . 

dx n / 



(3.2) 



We denote the number of minors by r :— n—m+1 and set CDj(x) := detMj(a;), for j 



1, . . . , r and consider all the (r + l)-faces of the cell A, i.e., for every {ii, 
{1, . . . , n + 1}, with i\ < %2 < ■ ■ ■ < i r +i, we consider the simplex 
( x Pi 1 , . . . , Pi r+1 ). The solution . . . , /V+i) of the system: 



Vfl} £ 



(J, r+1 U>i(P ir+1 ) 



(HUir+liPh) H \- (l r+1 (V r+1 (P ir+1 ) 

{(Jtl-\ h Hr+1 



(3.3) 



leads to a singular vertex 
(r 



: Mi 



Pi. 



■ ■ + n r+ iPi r+1 of E if all fij > 0, i.e., if Q is 
contained in the (r + l)-face of A considered. 

The (possibly empty) singular set E is an (m — l)-polytope defined as the convex 
hull of the singular vertices Q. 



3.2.2. Critical set 8. In the previous subsection we have detected the singular 
set E, on the basis of the fact that on the singular set the gradients are linearly depen- 
dent. On the other hand, on the critical set 9 there exists a positive linear combination 
of the gradients giving zero. Thus we proceed by estimating the coefficients Xj of the 
vanishing linear convex combination of the gradients, and cutting out the critical set 
9 from E by intersection with the half spaces where the linear interpolations of the 
As are positive. 

More precisely, we solve the system: 



X 1 Du 1 (P) + 
Ai + • • ■ + A r , 



X m Du m (P) = 0, 



(3.4) 



for Ai , . . . , A m . The Jacobian of u has rank m— 1 in almost all the points of the singular 



set (generic hypothesis), thus the system ( 3.4 ) has rank m, and by the implicit function 
theorem A^ are smooth functions of P. As a result the level sets {Xj(P) = 0}, which 
define the boundary of 9, are smooth manifolds. At the first order we are working 
with, the requests Xj(P) ^ cut out half spaces in E, defining possibly a critical sub 
polytope 9 in A. 

We notice that we do not know the actual values of Du on the singular vertices, 
i.e., the nodes of E. Nevertheless, we can estimate them by linearly interpolating 



9 Apart from degenerate cases, all m— minors share the same rank, so it is sufficient to consider 
only a selection of minors involving at least once each of the columns of the Jacobian. 
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the values of Du on the data nodes Pj 15 . . . ,P% r+1 defining the vertex Q in S. By 



taking the coefficients /ii,.. . , AV+i solving the system (3.3), we are guaranteed that 
the .Ditj (Q) := [i\Duj (P^ ) + •••+ n r+ \Duj (Pi r+1 ) are linearly dependent, and we are 

justified in solving for the vanishing linear combination Ai-Diti(Q)+- • ■+X m Du m (Q) = 
0. 

3.3. Convergence analysis for 8. Let us consider for this section a single sim- 
plex A. Intuitively, it is clear that the approximation S of S obtained by linear 
interpolation is quadratically good because of Taylor's theorem. We state more pre- 
cisely this result in the set-wise context we have adopted^] The distance between 
the sets A and B can be measured in terms of Hausdorff distance: 



d-u{A,B) :— max sup inf d(x,y), sup inf d(x,y) J . (3-5) 

XxEAVtB ' y eB x ^ A J 

Theorem 15 (Quadratic precision for £). Let Pq, . . . , P n be in general position, 
and such that Du has maximum rank. We denote by A — (Pq, . . . , P n ) the n-simplex 
which vertices are those points. Let U0i(x), . . . , w r (x) a selection of independent mi- 
nors of Du, and Wj(x) be the 1st order interpolation of the values of Wj on the nodes 
Pi. Assume is a regular value for a>i, . . . , uo r , that the zero levels of the Wj are 
transversal and that u>j(P;) ^ 0, for all i,j. Then 

£ = {wi(x) = 0} n • • • n {w r (x) = 0} , (3.6) 
S = {cDi(x) = o}n •• -n {w r (x) = 0}, (3.7) 

and there exists a constant C: 

d«(S,E)<a 2 , (3.8) 

where S > is the diameter of the simplex A. 

Proof. First of all, we notice that the Wk(x) are polynomials of the first derivatives 
of u, thus they are smooth in our hypotheses. Inductively, consider r = 1 and denote 
cu = cu!. By Taylor's theorem, 



w(x) = w(x) +0 (|a--P | 2 ) , 
| cu (a;) — w(x) 

for a suitable C > 0. Assume, without loss of generality, cu > on Pq, . . . ,Pk and 



i.e., 

(3.9) 

Ma;)-tu(:r)| ^C6 2 , 



cu < on Pfc+i, . . . , Pn- Let e := CS 2 . (See panel (a) of Figure |3.l[ ). Thus the zero 
levels of cu and cu are comprised between the ±e levels of cu, i.e., 

| a; € A w(x) = oj C ja; e A - e uo{x) sC e\ . (3.10) 

By the compactness of A, there exist xq £ {cu = 0}, x e € {cu = e}, such that, 

d n ({w = 0},{S> = e}) = \x Q ~x £ \ (3.11) 



10 General estimates on the accuracy of piecewise-linear approximations of implicitly defined man- 
ifolds are proved in 
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&={a=0] 



/ Cj=—e 



(a) 




Fig. 3.1. Critical simplexes, with representations of the critical set 9 and its first order approxd. 
Panel (a): two functions in two dimensions. Panel (b) and (c): two functions in three dimensions. 
In panel (b) the the thick line is the first order approximation 8, in panel (c) the critical set 9 is the 
curve of intersection of the two level surfaces u)i(x) = and 0)2(2;) = 0. 



and it holds that 

w(x e ) - w(x ) = |cD'(xo) ' {Xe -Xq)\ = 



dw 

dw 



Oo) 



\x e - xo\ 



(3.12) 



where w = ^ e _^°| . By means of an elementary linear algebra argument we have also 
that 



duo 

dw 



Oo) 



mm 

i=l,...,k, 
i — fc+l,...,n 



<v(Pi) - a)(PiO 



Pi - Pv 



=: B > 0, 



so we can conclude 



and eventually 



d n ({cu = 0} , {u> = o}) < a 2 



(3.13) 



(3.14) 



(3.15) 



Consider now r > 1, and assume inductively that the Hausdorff distance between 
the intersection of the zero levels of r — 1 transversal functions and the intersection of 
the zero level of the respective linear interpolations on an n-simplex is quadratically 
smaller than the simplex diameter. Thus we have 



S_ = {wi(x) = 0} n • • • n {cu r _i(ar) = 0} , 
S_ = {d5i(x) = 0} n • • • n {u5 r _i(x) = 0} , 



(3.16) 
(3.17) 
(3.18) 



If we consider one more function w r (x) on the linear space S_ , we are in the previous 
case, so there exists A > 0, 



d n (t- n {w r {x) = 0} , S_ n {d> r (x) = 0}) 



<A6 2 . 



(3.19) 
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By transversality of the cur, . . . , cu r , the fact holding for the linear space X_ holds 
also for the compact manifold with boundary X_ and the function cu r (see Lemma 



16 for the details). Thus there exists a B > such that 

d n (XL n {cu r (a;) = 0} , X_ n {w r (x) = 0}) BS 2 . (3.20) 

On the other hand, for the intersection of the zero levels of the transversal functions 
a>i, . . . ,cu. r _i on the linear space {uo r (x) = 0}, by the inductive hypothesis there 
exists C > 

d H {{uJr{x) = 0} n £_, {iv r (x) = 0} n E_) CS 2 , (3.21) 

so the thesis is proved by the triangle inequality. □ 

Lemma 16. Let E a manifold with boundary diffeomorphic to an n-simplex 
A, and cu : X — > R differ entiable and without critical points inside X. We have 
cu(x) = w(x) + 0{5 2 ), where u) is an affine approximation and S is the simplex 
diameter. Thus we have that 

d H {{w{x)^c} 1 {w(x)^c})^CS 2 , forallceR. (3.22) 

Proof. Let A X be a diffcomorphism, with £ > > 77 > 0. Thus we have, 
for all y € A, 

cu o ^j(y) = u> o ip(y) + 0(<5 2 ). 

For any y* in the zero level of cu o tp we can find a line segment [yi , y 2 ] , with yi being 
one of the nodes of A where cu o tp is negative and yi is a point on a face of A where 
on the forming nodes cu o tp is positive. By continuity there exists a point y on the 
line [2/1,2/2] where cu o tp is zero. 
Thus 

CUo^(y*)-cuo^(y) = tiotp(y*)-tiop(y) + 0(6 2 ) = o^-\y* - y\+0{8 2 ) , (3.23) 
which gives 

\y* - j/K CS 2 . (3.24) 

□ 



Note 17. The hypotheses of Theorem 15 are generic in the sense that they hold 



for a open and dense set of functions. In particular, is assumed to be a regular 
value for W\, . . . , cu r because the set of the singular values has zero measure (Sard's 



Theorem). See YM9,26,28 311 



Theorem 18. In the simplex A = (Po ; ■ • • ,Pn)> if @ * s the Pareto critical set 
and 6 is its linear approximation, there exists C > such that 

dn (0,9) ^ C6 2 . (3.25) 



Proof. The \j computed as described in Algorithm [T] are first order approxi- 
mations to smooth functions, apart from a measure zero set of points. Thus the 
conclusions of Theorem [15] apply as well to the intersection of X with the half spaces 
Xj(P) > 0. □ 
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3.4. Second order algorithm. In Algorithm [5] we describe how to extract the 
stable critical set 9 S , i.e., the set of locally Pareto optimal points, from the critical set 
9 determined in the first order algorithm [T] 



Algorithm 2 Second order algorithm for the stable Pareto critical set 9 S 
1: Consider a set of data points T> — {Pi, . . . , Pjv} and proceed as in Algorithm [I] 
2: for all Delaunay simplex A = (Pi , . . . , P< n ) in the tessellation do 
3: Compute the matrix of the second derivatives D 2 u on the nodes Pi Pi n 
4: On the vertices Q of 9, linearly interpolate the second derivatives D 2 u(Q) 
5: Compute a basis Wi, . . . , iu n _ m _|_i for ker Du(Q), and set H(Q) := w T 

(Ai(Q)5^i(Q) + ■ • • + \ m (Q)Dh^ n (Q)) • u,. 
6: Compute the eigenvalues a%, . . . , <r n - m +i of H(Q). 

7-. Cut out from 9 the sub polytope 9 S where o~k ^ for all k = 1, . . . , n — m + 1 
8: end for 

9: Compose a simplicial complex glueing together adjacent polytopes 9 S 



The second derivatives could be also approximated computing the finite differ- 
ences of the values of the gradients on the nodes of the n-simplex. Indeed, setting 

Vi = Pi-P , i = l,...,n, 

we have 

^ - (a^) u ^ ? sis; ■ S ^ ? < v »™ - v "^»^ - - p "»< ■ 

(3.26) 

Using this formula, the quadratic precision cannot be guaranteed for locating 
boundary points of the stable critical set. Furthermore, because the boundary faces 
belong to different simplexes, the estimated boundary points for 9 S would jump from 
simplex to simplex. 

On the other hand, the formula will be correct for discriminating the nature of 
inner stable critical points, without extra computations. Boundary simplexes can thus 
analyzed with second derivatives, allowing the computation of the boundary of 9 S . 

4. Applications. 

4.1. Two functions in two dimensional examples. A series of examples in 
two dimensions is presented below. Via global analysis one sees that, for structurally 
stable mappings, the Pareto critical set is a 1-dimensional manifold with boundary 
contained in S. Critical points can only be of one of the following types: 

1. fold, i.e., the mapping is locally equivalent to u\ — X\, u 2 = x\, 

2. cusp, i.e., the mapping is locally equivalent to u\ = x\, — x\X2 — \x\. 
Therefore, the branches of Pareto critical points are composed by folds, which intersect 
only pair-wise and at non-zero angles. Some local branches terminate in cusps, where 
the status of critical points can change from stable to unstable. Finally, images of 
folds and cusps do not intersect |7|[63]. 

Functions gradients are evaluated on a grid of regular triangles and the critical 
set 9 is estimated according to the first order algorithm. Boundary points are marked 
with black diamonds. The generalized Hessian is estimated on the nodes of the critical 
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set, computing second derivatives in the triangles where its index changes, allowing to 
estimate the position of the points separating stable from unstable branches. Cusps 
are marked by a black star, stable branches are colored in red, unstable branches in 
orange and finally non critical branches are gray. 

Example 1. Consider two negative definite quadratic polinomials. The critical 
stable set is a curve joining the two individual critical points. Other singular branches 
occur in outer regions of the domain. 

Ul {x,y) = -UYox 2 -0.98y 2 , 

u 2 (x, y) = -0.99(a; - 3) 2 - 1.03(y - 2.5) 2 . 



See Figure \4^T\ 

Example 2. This example is taken from \5t 



u\{x,y) = -y 

y-x 



u 2 {x,y) 



■> (4.2) 



x + 1 



The critical set is a single curve split in a stable and an unstable branch, while the 
separating point is a cusp. See Figure \4~J^ 

Example 3. In the following mapping there are two second order polynomials, 
one negative definite and the other indefinite. The outcome is an (unbounded) global 
Pareto front and a local unbounded front terminating in a cusp. 

ui{x,y) = -x 2 -y 2 , 

u 2 (x,y) = -(x-6) 2 + (y + 0.3) 2 , ( ~" > ' 

See Figure \4~^[ 

Example 4- The following mapping is composed by a quadratic polynomial and a 
bimodal function. The resulting singular set is composed by an unbounded branch and 
two loops. One of the loops is critical and forms a local Pareto front delimited by two 
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cusps, while the other loop is non critical. 

Ul (x, y) = -x 2 -y 2 ~ 4(exp(-(a: + 2) 2 - y 2 ) + exp(-(a: - 2) 2 - y 2 )), 

u 2 (x,y) = -(x-6) 2 -(y + 0.5) 2 . '"' ' 

See Figure \4-4\ 

4.2. Higher input dimension. Example 5. The following mapping demon- 
strates the capabilities of the method in distinguishing local and global features of the 
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Pareto set. A widespread optimal branch is surpassed by a local branch. The sharper 
branch is composed by an unstable part (orange) and a stable part (red) which is inter- 
rupted by noncritical insertions (gray). Nevertheless, as illustrated in figures 4-5(a-b) 
the two separate branches are properly recognized by the algorithm and moreover the 
transitions among critical/ non critical and stable/unstable intervals are detected. For 
comparison, the outcome of the application of a commercial implementation of normal 
boundary intersection by Das and Dennis is shown in figures 4-5(c-d)^^ The 
starting grid (green dots) was 10 x 20 x 10, and we considered 50 NBI subproblems. 
The sequence of NBI points is marked by black stars. For this particular problem, 
NBI tracks correctly the broad Pareto optimal branch, in the sense that it produces a 
parametrization of it. However, the smaller branch is missed, although some of the 
points of the starting grid were close to this critical zone. It is clear that point-wise 
strategies suffer at tracking the Pareto optimal set and fail at performing widespread 
trade studies, apart from small intervals where different fronts are far apart and do 



Applications of modeFRONTIER® arc a courtesy by E. Rigoni at ESTECO 
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not change status (from critical to non-critical, stable to unstable, and so on). 

pa = (0.0, 0.15, 0.0) T , 

Pl = (0.0,-l.l,0.0) T , 

(-IS) -0.03 O.Oir 
M = -0.03 -1. 0.07 
\0.011 0.07 -1.01, 

g(x,y,z,M,p,a) = \ —e ^ , l 4 -^ 

V cr 

f(x, y, z) = g(x, y, z, M,p Q , 0.35) + g(x, y, 0.5z, M,px, 3.0), 
ul(x, y, z) = ^-x + -~f(x, y, z), 

u2(x,y,z) = -^y x + ^y.f( x ,y,z). 



Example 6. The following 6 dimensional example is a regularization of the third 
of the ZDT problems VM, which has degenerate second derivatives. The Pareto fronts 
of original and modified problems correspond each other in output space. We used 
a Delaunay tessellation defined on 300 randomly generated points. The results are 
presented in figure \4-(>\ Critical and merely singular branches are correctly represented. 
Note that the critical branches are correctly marked as unstable, being minima. 

ul(xi, ...,x 6 ) = xi, 

u2(xi, . . . , xq) = 1 — yfx\ — x\ sin(107ra;i) + x\ + • • ■ + x\, (4-6) 
x x G [0.1,0.425], x 2 ,...,x 6 G [-0.16,0.16]. 



4.3. Three functions examples. Example 7. The simplest nontrivial non de- 
generate example we can build in the three dimensional case is composed by three 
negative definite 2nd order polynomial functions fj(x),j — 1,2,3. Additionally, we 
introduce a small non polynomial perturbation. 



fj{x) = (x 




(x-Cj), j = 1,2,3, 



ft COS (^( 



y) 
y) 



J 



(4.7) 



Where x = (xx, x 2 , X3) G 



> 0, i,j = 1,2,3, Ci,C 2 ,C 3 G 1R 3 are distinct, 



non collinear points, while f3j , jj are real numbers. In the generic case the singular 
set is an hypersurface ofM 3 , while the critical set 9, which is stable, is diffeomorphic 
to a triangle, i.e., 9 is a compact connected manifold with boundary and three corners, 
corresponding to the minima of the three functions Ux,U2,U3. See Figure \4~^ a) ■ 
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(c) (d) 

Fig. 4.5. Example [5| Panel (a): singular (gray), Pareto critical (orange) and Pareto stable 
(red) sets in the problem domain. Green dots mark the nodes of the starting regular grid defining the 
tessellation. Octahedrons mark points separating critical and non critical branches. Spheres separate 
stable from unstable branches, i.e., mark cusps. Panel (b): image of singular and Pareto sets. Dia- 
monds separates critical from non critical branches while stars mark the cusps. Panel (c)-(d): results 
obtained running the commercial implementation of NBI-AFSQP available in modeFRONTIER® , 
courtesy of E. Rigoni. Small green points are a starting regular grid, while marked points are the 
solutions of the 50 NBI subproblems. 



Example 8. We break the convexity of the previous example by adding a secondary 
maximum to the first function. We define a further negative definite, 2 nd order poly- 
nomial fi(x) and set u(x) as: 





(4- 
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The main portion of the Pareto set is slightly deformed while a new branch appears. 
In Figure \4~^ b ) is shown the resulting Pareto critical set 9 obtained by iterative ap- 
plication of Algorithm^ as described in section^ 



4.4. A constrained example. We briefly sketch here an adaptation of Algo- 
rithm [T] to the case of equality constraints. Next we illustrate a simple application. 

— > R n_d is a smooth function such that 



Let W := |.t e R n g(x) = oj, where g 



Of) 



has maximum rank. 
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Algorithm 3 Equality constraint case for the 1st order algorithm 

l: Determine a piecewise linear approximation W of W, with nodes P\, . . . , P n 
2: for all simplex A in the tessellation W, do 

3: determine a piecewise linear approximation of the singular, critical and stable 
sets possibly crossing the simplex. This is done on the basis of the projections 
of the gradients of the UjS on the tangent space to W. In principle, a basis for 
TW should be chosen respecting the orientation. 

4: for all Node P of the simplex A do 

5: compute Dg(P) and project graduj(P) on ker Dg(P) via iTDg(p)- 

6: compute an independent set of minors for the matrix 

(ir grad u\ (P) , . . . , tt grad u m (P)) or, equivalently, 
7: compute an independent set of minors for the matrix 

(grad gi, . . . , grad# n _ d , grad m, . . . ,gradu m ) 
8: end for 

9: determine if all minors vanish inside the simplex S, and in that case locate E 

via inverse linear interpolation 
10: estimate Aj and determine the critical set 9 as in Algorithm [l] 
11: end for 

12: eventually glue together adjacent portions of E and 9. 



Example 9. Maybe the simplest example of constrained problem is when W = § 2 
and the objectives are the first two coordinates, ui(xi, x%, X3) := x\, U2{x\, X2, X3) := 
3?2p1 Explicit algebraic computation give that the singular set E is the equator of the 
sphere, where the two curvilinear segments where X\X2 > are the critical set 9, as 
illustrated in figure 4..8(a). By applying algorithm^ we start by approximating the 



sphere by an icosahedron. At every node P — {x\, x%, x%), | [x\ + x\ + x\ — l) = 0, 
we have Dg(P) — [x\, X2, X3), therefore the projections of the gradients of Uj are 
(1 — x\, —X1X2, — X1X3) and (— X\X2, \ — x\, —2:1X3). The singular set E passes through 
the triangles where the pair of vectors n grad U\ and ir grad U2 change orientation in 
the tangent plane to S 2 . It is equivalent then to compute the determinant of the matrix 
which columns are gradg, gradui and gradu 2 md to determine the line along which 
it vanishes. This gives exactly the "equator" of the icosahedron. The signs of the 
Xj depend on the sign of the scalar product among n grad Uj , again giving as turning 
points the intersections with the axes. The results are summarized in figures \4-8j( a) 
and (b). 

5. Iterative schemes. The previously presented approach defines an approxi- 
mation of the Pareto optimal set given any distribution of points in the domain. Here 
we propose and discuss an iterative scheme. At every step a selection of points from 
the approximated Pareto optimal set is added to the dataset V, the gradients in the 
new points are evaluated, the tessellation is updated and a refined approximation of 
the Pareto set is built. The desired effect is obviously to get closer and closer to the 
actual optimal set, but an efficient strategy should produce an as uniform as possible 
discretization of the optimal set. 

A naive approach would suggest to insert in the set of the candidates for evaluation 
all of the nodes of the complexes, i.e., all the stable admissible vertices computed and 
all of the boundary points, both for criticality and stability. Nevertheless, a glance at 



This example is also discussed in [12| 



GLOBAL ANALYSIS AND MULTIOBJECTIVE OPTIMIZATION 



21 




Xl 



X i a. 




(a) 

Fig. 4.8. (a): Singular and critical sets determined analytically for example^ (b): piecewise 
linear approximation of the sphere, of the singular set (green solid curves) and of the critical set 
(orange and thicker curves) 



the examples of the previous section reveals that the sizes of the optimal complexes 
cover a wide distribution, in particular the patches result very small if £ passes close 
to tessellation nodes. Moreover, little experience shows that large patches are reduced 
sensibly slowly if none of their internal points is introduced. With this criteria in mind 
we introduce an iterative scheme for the case of two functions. 

5.1. Two functions iterative scheme. In the two functions case the Pareto 
optimal set is a one dimensional manifold with boundary, i.e., a collection of curved 
intervals. The discrete approximation is a collection of polygonal curves. For every 
interval a sequence of candidate points equally spaced along the polygonal curve is 
extracted. The number of points is chosen equal to the number of segments, so that 
approximately every triangle containing optimal points is split as close as possible to 
the optimal set. 

5.2. Higher number of functions. It seems reasonable to take into account 
of the stratified structure of 9 in the design of an iterative strategy. In fact, strata 
should be filled as uniformly as possible, where the uniformity is determined according 
to the fc-dimensional measure, if k is the dimension of the stratum. So, taking for 
instance the situation of example [7J corners' approximations are re-evaluated at each 
iteration, uniformly spaced points are taken along boundary lines, exactly as in the 
two functions case, while internal points should be distributed proportionally to the 
area of the triangles and polygons composing 9. This is more difficult to be defined 
precisely. Indeed, the problem of uniformly filling a general n-dimcnsional region is a 
long-time crucial issue for statistical applications [46] . Furthermore, in our problem 
we have to fill uniformly a general rt-dimensional manifold, thus we have somehow to 
take into account of the effects of the curvature on the measure of the volumes. 

Taking inspiration from a Design of Experiments strategy called maximin distance 
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Algorithm 4 Uniformly filling a simplicial complex 
l: Tessellate in simplexes the polytopes of the mesh 
2: Build the adjacency lists of the simplexes 
3: Evaluate the volume of each simplex 

4: For every simplex define the accumulated volume as the sum of its volume and 

the volume of the adjacent simplexes 
5: Pick the simplex with the maximum accumulated volume 
6: Add to the candidates stack the center of mass of this maximal simplex 
7: repeat 

8: Recompute the accumulated volumes excluding the already picked simplexes 
9: until the desired number of candidate points is collected 



design 20 we proceed as described in Algorithm [4] This algorithm, because of point 
6, can lead to long and thin simplexes and to numerical instabilities when iterated 
many times. This problem can be tackled by the application of mesh improvement 
strategies, as described below for the case of two dimensional domains. However, to 
the author's knowledge, general dimension mesh improvement strategies are still not 
available at present. 

5.3. Stopping criteria. Analogously to gradient based methods of single func- 
tion optimization (nonlinear conjugate gradient, Newton and Newton-like methods), 
a stopping criterion could be based on the magnitude of the minors Mi , . . . , M r com- 
puted in the points of the last iteration. The magnitude of the minors is analogous 
to the magnitude of the gradients for single objective optimization. 

In fact, we could define a different iterative strategy taking the rule of subdividing 
only stable critical triangles contained in simplexes where the minors are larger than 
a prescribed threshold. 

5.4. Application. We show the behavior of the iterative scheme described above 
applied to the mapping in Example [4j At each iteration we generate a number of 
evenly spaced points along the approximate stable Pareto critical set. In order to ex- 
hibit the claimed quadratic convergence, it is necessary to sample the approximated 
optimal set by quadratically finer intervals, i.e., comparable to the precision gained. 
As a result the density of points will grow exponentially w.r.t. the number of iter- 
ations. Such a density of points rapidly deteriorates the mesh quality, i.e., skinny 
triangles suddenly appear leading to numerical instability. Thus, at each iteration, a 
number of extra nodes (namely, the circumcenters of the most skinny triangles) should 
be introduced in the mesh in order to produce a nicely grading. At this extent we 
have coupled our method with Ruppert's algorithm, as implemented in the triangular 



mesh refinement software triangle by J. R. Shewchuk 51 52 . 

Already at the fifth iteration the triangulation starts to suffer from numerical 
instability, thus we consider 9 S generated at the fourth iteration as the optimum 
and evaluate the Hausdorff distances between 6 y s ' and 9 K S , for i = 1, . . . , 3. As in can 



be seen in Figure 5.2 panel (a), the Hausdorff distances between the approximated 
Pareto sets and the numerical optimum converges superlinearly. For reference also the 
convergence behavior of the maximum and the mean minors magnitude are reported. 

In Figure |5.1| is illustrated how the triangulation and the representation of the 
Pareto set evolves from one iteration to the subsequent. 



In Figure 5.3 the three dimensional problem of Example [7] is tackled by the 
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procedure described in Algorithm [4] The algorithm has been applied by introducing 
only a small number (~10) of new points on the sites with the largest magnitude of 
the minors. In such a way it was possible to iterate 70 times the scheme reaching a 
very small magnitude for the minors. 

Because of the mentioned exponentially growing number of samples necessary to 
exhibit quadratic convergence speed for the iterative scheme, the experiment described 
for the two dimensional case becomes prohibitive in three dimensions. 

On the other hand, a superlinear precision can be verified as well by means of 
a sequence of approximations obtained from a progressively finer regular meshes, 
corresponding to a sequence of mesh sizes s = 2.8, . . . , 0.4. Because the Hausdorff 
distance among the first s-approximation and the optimal set is already comparable 
to the largest mesh size of the optimal set, we analyze the sequence of average distances 
between a point of one set from the triangles of the other set, instead of considering 
the maximum distances. These average distances decrease faster than linearly as it 
can be seen by plotting the ratio of distances and mesh sizes versus the mesh sizes, 



as reported in Figure 5.2 panel (b) 



6. Conclusions and perspectives. We have presented a novel multiobjec- 
tive optimization method which exploits the manifold structure underlying the set 
of Pareto optimal points. Global analysis seems the proper setting where those struc- 
tures arise and can be studied. We approximate Pareto sets via simplicial complexes, 
specializing simplicial pivoting techniques for detecting the singular manifold E and 
successively cutting out critical and stable subsets 9 and 9s- By contrast, most of the 
available strategies are aimed at producing a scatter of optimal points which images 
should be evenly distributed. We have illustrated some generic situation where this 
program could not be successfully completed via such point-wise strategies, because 
of nonconvexities of the functions. Adopting the Hausdorff measure, Newton-type 
estimates lead to quadratic convergence in a set-wise sense. 

Because of its global character, the method proposed here is demanding. Delaunay 
tessellations, in particular, are defined for every possible input dimension, but are 
numerically workable only for small dimensional cases. The theory of singularities of 
mappings also highlights further limitations encountered when dealing with a large 
number of functions. Lastly, we have everywhere assumed the differentiability of 
the functions. Therefore the method is not suitable for non-smooth optimization, 
instead it is supposed to be applicable also via smooth surrogate functions, when 
approximations are consistent with the functions at hand. Possible extensions of the 
algorithms described in this paper are conditioned by the issues enumerated below. 

6.1. The curse of dimensionality. The first problem one encounters when 
trying to apply this algorithms to industrial strength problems is the limitations to 
the input dimension. The whole procedure is based on a Delaunay tessellation of 
the input domain, which complexity grows exponentially with dimension. As pointed 
out for instance in the qhull documentation 18], building the convex hull of a 9- 
hypercube is computationally exhaustive. Analogous limitations are encountered in 
global optimization, where the search for optima in high dimensional domains cannot 
realistically be performed on real case problems. Indeed, typically, global search 



algorithms are rarely tested and compared over dimensions larger than 5 (see 21 



23 39 48 50 68 . This problem is structural and cannot be resolved by augmenting 
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Fig. 5.1. Iterative scheme for the mapping in Example\l\ 
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Fig. 5.2. Convergence behavior for iterative schemes applied to Examples^ and^ Panel 
(a): Iterative scheme applied to Example^ Stars represent the Hausdorff distance between the 
approximated Pareto set at each iteration and the Pareto set obtained at the 4~th iteration, which is 
employed as an optimum. Diamonds and circles represent respectively the maximum and the mean 
absolute value of the minors of the Jacobian matrix computed on the points of the approximated 
Pareto set. Log scale reveals the superlinear convergence behavior. Horizontal dashed line represents 
the mesh size of the numerical optimum. Panel (b): Algorithm [7] applied to Example \l\ using 
progressively finer regular meshes. Stars represent the average distance between a node of the optimal 
set and the triangles of the approximation and viceversa. The ratio between the distance and the 
the mesh size decreases faster than linearly according to Theorem \ 1 5| 



the computational resources. Therefore, the presented algorithms are best suited for 
low dimensional problems. In fact, the curse of dimensionality is a strong motivation 
for reflecting carefully on the necessity of introducing extra input variables when 
tackling new problems and designing experiments. A possible exit strategy could be 
screening the input variables 47 57 . This practice can be surprisingly successful, 



because usually sparsity of effects occurs, revealing a pronounced hierarchy among 
input variables, leading to sensible simplification of the problem formulation. p*| 

Alternatively, as described in the recent paper [6], it is possible to redefine any 
problem in n-dimensional space in an equivalent problem in a linear subspace of di- 
mension 2(m— 1) + 1, if m is the number of objectives. This is because the singular set 
is an (m — l)-manifold, and by Whithney's embedding theorem, in the compact case, 
almost all projections on linear 2(m — 1) + 1 dimensional subspaces are diffeomor- 
phisms. This would mean that bi-objective problems could be equivalently discussed 
in a 3 dimensional domain, 3 objectives would require only 5 input variables, and 
so on. This would dramatically reduce the computational burden of the tessellations 
involved. 



13 The sparsity of effects is an empirical law stating that in a generic physical experiment one 
usually observes that the 80% of the effects are due to the 20% of the factors. Related phenomena 
are that the first order contributions are the most important, while higher order contributions decay 
fast. Finally one observes that the largest interactions (second order contributions) are combination 
of the strongest factors. See for instance [671. 
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Mesh size = 0.7 

Fig. 5.3. Algorithm[l\ applied to Example\7\for progressively finer meshes. The blue surface is 
the numerical optimum obtained with long application of the iterative scheme. 



6.2. Surrogate models. In industrial applications, when the objective func- 
tions at hand could be non differentiable, or may be computationally too expen- 
sive, preventing the computation of derivatives, we figure that the applicability of 
the algorithm proposed here will be significantly extended by using surrogate mod- 
els. There exists an extensive literature developed in recent years on this subject 



(see 21 22 45 46 and the references therein), also with specific applications to mul- 
tiobjective optimization [24j[25] 

The procedures of this paper can be adapted applying the Algorithms [I] and [2] 
to a surrogate model u fitted to the values of the true functions u computed on the 
given data points. On the outcoming candidate points, new evaluations of u are to be 
computed, and a new surrogate model fitted to the increased dataset. This reduces the 
computational effort for computing derivatives and furthermore prevents premature 
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stopping of the optimization process due to accidental failure of function evaluation 
at some data point. Again, the convergence to the Pareto sets of the true functions 
is guaranteed via global analysis. 
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